The advent of ML music models such as Google Magenta's MusicVAE now allows us to extract and reproduce compositional features from existing datasets. These models let computational composers parameterize abstract variables such as style and mood. By leveraging these models and combining them with procedural algorithms developed over the past few decades, it is possible to create a dynamic song that composes music in real time to accompany interactive experiences. Malakai is a tool that helps users of varying skill levels create, listen to, remix, and share such dynamic songs. Using Malakai, a composer can create a dynamic song that a listener can interact with.
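The style and mood controls mentioned above can be pictured as arithmetic in a learned latent space. The snippet below is purely illustrative and assumes only NumPy: the latent vectors stand in for codes that a model such as MusicVAE would produce, and the blending function is a hypothetical stand-in for the kind of remixing step a tool like Malakai might expose.

```python
import numpy as np

def blend_latents(z_calm: np.ndarray, z_energetic: np.ndarray, mood: float) -> np.ndarray:
    """Spherically interpolate between two latent codes.

    `mood` in [0, 1] sweeps from the "calm" code to the "energetic" one,
    mimicking how a latent-variable music model can expose an abstract
    mood control to a composer.
    """
    z0 = z_calm / np.linalg.norm(z_calm)
    z1 = z_energetic / np.linalg.norm(z_energetic)
    omega = np.arccos(np.clip(np.dot(z0, z1), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return z_calm
    return (np.sin((1 - mood) * omega) * z_calm
            + np.sin(mood * omega) * z_energetic) / np.sin(omega)

# Two hypothetical 512-dimensional latent codes (e.g., ones a MusicVAE-style
# decoder would turn into bars of melody).
rng = np.random.default_rng(0)
z_a, z_b = rng.normal(size=512), rng.normal(size=512)
z_mid = blend_latents(z_a, z_b, mood=0.5)  # halfway between the two moods
```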
As text generated by large language models proliferates, it becomes vital to understand how humans engage with such text, and whether or not they are able to detect when the text they are reading did not originate with a human writer. Prior work on human detection of generated text focuses on the case where an entire passage is either human-written or machine-generated. In this paper, we study a more realistic setting where text begins as human-written and transitions to being generated by state-of-the-art neural language models. We show that, while annotators often struggle at this task, there is substantial variance in annotator skill and that given proper incentives, annotators can improve at this task over time. Furthermore, we conduct a detailed comparison study and analyze how a variety of variables (model size, decoding strategy, fine-tuning, prompt genre, etc.) affect human detection performance. Finally, we collect error annotations from our participants and use them to show that certain textual genres influence models to make different types of errors and that certain sentence-level features correlate highly with annotator selection. We release the RoFT dataset: a collection of over 21,000 human annotations paired with error classifications to encourage future work in human detection and evaluation of generated text.
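The annotation task can be scored by how close a guess lands to the true human-to-machine boundary. The function below is a hypothetical scoring rule for illustration only, not necessarily the exact incentive scheme used in RoFT.

```python
def boundary_score(guess_idx: int, true_boundary_idx: int, max_points: int = 5) -> int:
    """Score a single annotation of where generated text begins.

    A guess before the true boundary accuses human-written text and earns
    nothing; a guess at or after the boundary earns points that decay the
    further it falls past the first generated sentence.
    """
    if guess_idx < true_boundary_idx:
        return 0
    return max(0, max_points - (guess_idx - true_boundary_idx))

# Example: the fourth sentence (index 3) is the first machine-generated one.
print(boundary_score(guess_idx=3, true_boundary_idx=3))  # 5 (exact)
print(boundary_score(guess_idx=5, true_boundary_idx=3))  # 3 (two sentences late)
print(boundary_score(guess_idx=1, true_boundary_idx=3))  # 0 (guessed too early)
```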
Drawing from the resources of psychoanalysis and critical media studies, in this paper we develop an analysis of Large Language Models (LLMs) as automated subjects. We argue the intentional fictional projection of subjectivity onto LLMs can yield an alternate frame through which AI behaviour, including its productions of bias and harm, can be analysed. First, we introduce language models, discuss their significance and risks, and outline our case for interpreting model design and outputs with support from psychoanalytic concepts. We trace a brief history of language models, culminating with the releases, in 2022, of systems that realise state-of-the-art natural language processing performance. We engage with one such system, OpenAI's InstructGPT, as a case study, detailing the layers of its construction and conducting exploratory and semi-structured interviews with chatbots. These interviews probe the model's moral imperatives to be helpful, truthful and harmless by design. The model acts, we argue, as the condensation of often competing social desires, articulated through the internet and harvested into training data, which must then be regulated and repressed. This foundational structure can however be redirected via prompting, so that the model comes to identify with, and transfer, its commitments to the immediate human subject before it. In turn, these automated productions of language can lead to the human subject projecting agency upon the model, effecting occasionally further forms of countertransference. We conclude that critical media methods and psychoanalytic theory together offer a productive frame for grasping the powerful new capacities of AI-driven language systems.
Multimodal integration of text, layout and visual information has achieved SOTA results in visually rich document understanding (VrDU) tasks, including relation extraction (RE). However, despite its importance, evaluation of the relative predictive capacity of these modalities is less prevalent. Here, we demonstrate the value of shared representations for RE tasks by conducting experiments in which each data type is iteratively excluded during training. In addition, text and layout data are evaluated in isolation. While a bimodal text and layout approach performs best (F1=0.684), we show that text is the most important single predictor of entity relations. Additionally, layout geometry is highly predictive and may even be a feasible unimodal approach. Despite being less effective, we highlight circumstances where visual information can bolster performance. In total, our results demonstrate the efficacy of training joint representations for RE.
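The leave-one-modality-out design described above can be organized as a loop over modality subsets. The sketch below only illustrates the experimental bookkeeping; `train_and_evaluate` is a hypothetical placeholder for fine-tuning a VrDU relation-extraction model on the chosen inputs.

```python
from itertools import combinations

MODALITIES = ("text", "layout", "visual")

def train_and_evaluate(active: tuple) -> float:
    """Hypothetical stand-in: fine-tune an RE model using only the `active`
    modalities and return its F1 on a held-out set."""
    # In a real experiment this would build the model, mask or drop the
    # excluded inputs, train, and score the predicted entity relations.
    return 0.0

results = {}
for k in range(1, len(MODALITIES) + 1):
    for subset in combinations(MODALITIES, k):
        results[subset] = train_and_evaluate(subset)

# e.g. compare results[("text", "layout")] against the unimodal runs to see
# how much each modality contributes to relation extraction.
```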
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
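Because the checkpoints are openly released, BLOOM can be loaded through the Hugging Face `transformers` library. A minimal sketch, assuming network access and using the small `bigscience/bloom-560m` variant so it fits in memory; the full 176B `bigscience/bloom` checkpoint follows the same interface but requires multi-GPU or offloaded loading.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-560m"  # smaller sibling of the 176B bigscience/bloom checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "A multilingual language model trained on 46 natural languages can"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```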
Explainability methods for NLP systems encounter a version of the fundamental problem of causal inference: for a given ground-truth input text, we never truly observe the counterfactual texts needed to isolate the causal effect of model representations on outputs. In response, many explainability methods do not use counterfactual texts, assuming they will be unavailable. In this paper, we show that robust causal explanation methods can be created using approximate counterfactuals, which can be written by humans to approximate a specific counterfactual or simply sampled using metadata-guided heuristics. At the core of our proposal is the Causal Proxy Model (CPM). A CPM explains a black-box model $\mathcal{N}$ because it is trained to have the same actual input/output behavior as $\mathcal{N}$ while creating neural representations that can be intervened upon to simulate the counterfactual input/output behavior of $\mathcal{N}$. Furthermore, we show that the best CPM for $\mathcal{N}$ performs comparably to $\mathcal{N}$ when making factual predictions, which means the CPM can simply replace $\mathcal{N}$, leading to more explainable deployed models. Our code is available at https://github.com/frankaging/causal-proxy-model.
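A core mechanic here is the interchange intervention: the proxy is run on a factual input while part of its internal representation is overwritten with the value computed on an approximate counterfactual, and its output is trained to match the black box's behavior on that counterfactual. The PyTorch sketch below is a simplified illustration with assumed shapes and a toy MLP proxy, not the released implementation.

```python
import torch
import torch.nn as nn

class TinyProxy(nn.Module):
    """Toy proxy with a single hidden layer whose activations can be swapped."""
    def __init__(self, d_in=16, d_hidden=32, d_out=3):
        super().__init__()
        self.encoder = nn.Linear(d_in, d_hidden)
        self.head = nn.Linear(d_hidden, d_out)

    def forward(self, x, swap_hidden=None, swap_dims=None):
        h = torch.relu(self.encoder(x))
        if swap_hidden is not None:
            # Interchange intervention: overwrite the hidden dimensions reserved
            # for a concept with the representation from the counterfactual input.
            h = h.clone()
            h[:, swap_dims] = swap_hidden[:, swap_dims]
        return self.head(h)

proxy = TinyProxy()
opt = torch.optim.Adam(proxy.parameters(), lr=1e-3)
concept_dims = torch.arange(0, 8)        # hidden dims reserved for one concept (assumed)

x_factual = torch.randn(4, 16)           # factual inputs
x_counterfactual = torch.randn(4, 16)    # approximate counterfactuals for that concept
y_factual = torch.randn(4, 3)            # stand-in for the black box N's factual outputs
y_counterfactual = torch.randn(4, 3)     # stand-in for N's outputs on the counterfactuals

with torch.no_grad():
    h_source = torch.relu(proxy.encoder(x_counterfactual))

# Match N factually, and match N's counterfactual behavior under intervention.
loss = nn.functional.mse_loss(proxy(x_factual), y_factual) \
     + nn.functional.mse_loss(proxy(x_factual, swap_hidden=h_source, swap_dims=concept_dims),
                              y_counterfactual)
loss.backward()
opt.step()
```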
Finite mixture modeling is a popular approach to clustering, due in large part to its soft cluster-membership probabilities. However, the EM algorithm, the most common algorithm for fitting finite mixture models, suffers from a number of problems. We address these problems that plague clustering with finite mixture models, including convergence to solutions corresponding to local maxima and algorithmic speed issues in high-dimensional settings. This is accomplished by developing two novel algorithms that combine a spectral decomposition of the data matrix with a non-parametric bootstrap sampling scheme. Simulations demonstrate the effectiveness of our algorithms, showing not only their flexibility but also their ability to avoid solutions corresponding to local maxima, relative to other (bootstrapped) clustering algorithms. Our novel algorithms typically exhibit more consistent convergence criteria and markedly improved speed compared to other bootstrapped algorithms for fitting finite mixture models.
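The algorithms themselves are not reproduced here, but the basic idea of combining bootstrap resampling with EM initialization can be sketched with scikit-learn's GaussianMixture. This is an illustration of bootstrap-initialized EM only, not the authors' spectral-decomposition algorithms.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def bootstrap_init_gmm(X: np.ndarray, n_components: int, n_boot: int = 10, seed: int = 0):
    """Fit a GMM by running short EM on bootstrap resamples and keeping the
    initialization whose likelihood on the full data is highest."""
    rng = np.random.default_rng(seed)
    best_model, best_ll = None, -np.inf
    for _ in range(n_boot):
        idx = rng.integers(0, len(X), size=len(X))        # non-parametric bootstrap sample
        gm = GaussianMixture(n_components=n_components, max_iter=20,
                             random_state=int(rng.integers(1_000_000))).fit(X[idx])
        ll = gm.score(X)                                   # mean log-likelihood on full data
        if ll > best_ll:
            best_model, best_ll = gm, ll
    # Refine the best candidate with full EM on the original data.
    return GaussianMixture(n_components=n_components, max_iter=200,
                           means_init=best_model.means_).fit(X)

X = np.vstack([np.random.randn(200, 5) - 2, np.random.randn(200, 5) + 2])
model = bootstrap_init_gmm(X, n_components=2)
print(model.predict_proba(X)[:3])  # soft cluster-membership probabilities
```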
This paper considers the problem of unsupervised 3D object reconstruction from in-the-wild single-view images. Because of ambiguity and inherent ill-posedness, the problem is intrinsically hard to solve and therefore demands strong regularization to achieve disentanglement of different latent factors. Unlike existing works that introduce explicit regularization into the objective function, we look to a different space for implicit regularization: the structure of the latent space. Specifically, we restrict the structure of the latent space to capture a topological causal ordering of latent factors (i.e., representing causal dependencies as a directed acyclic graph). We first show that different causal orderings matter for 3D reconstruction, and then explore several approaches to find a task-dependent causal factor ordering. Our experiments demonstrate that the latent space structure indeed serves as an implicit regularization and introduces an inductive bias beneficial for reconstruction.
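One way to picture such a constrained latent structure is as a strictly lower-triangular dependency mask, so that each latent factor may depend only on factors earlier in a chosen causal ordering, which guarantees the dependency graph is a DAG. The snippet below is a schematic linear illustration with made-up dimensions and factor names, not the paper's architecture.

```python
import numpy as np

n_factors = 4          # assumed ordering, e.g. shape -> pose -> texture -> lighting
latent_dim = 8         # dimensions per factor (assumed)

# Strictly lower-triangular block mask: factor i may only read from factors j < i.
mask = np.kron(np.tril(np.ones((n_factors, n_factors)), k=-1),
               np.ones((latent_dim, latent_dim)))

rng = np.random.default_rng(0)
W = rng.normal(size=mask.shape) * mask           # masked mixing weights
eps = rng.normal(size=n_factors * latent_dim)    # exogenous noise per latent unit

# Linear structural model z = W z + eps; solvable because W is strictly
# lower-triangular (the causal graph is acyclic).
z = np.linalg.solve(np.eye(n_factors * latent_dim) - W, eps)
print(z.reshape(n_factors, latent_dim).shape)    # (4, 8): ordered causal factors
```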
Deep learning has been shown to accurately assess "hidden" phenotypes and predict biomarkers from medical imaging beyond traditional clinician interpretation. Given the black-box nature of artificial intelligence (AI) models, caution should be applied when deploying models in healthcare, as prediction tasks may be short-cut by demographic differences across disease and patient populations. Using large echocardiography datasets from two healthcare systems, we test whether deep learning algorithms can predict age, race, and sex from cardiac ultrasound images and assess the impact of various confounding variables. We trained video-based convolutional neural networks to predict age, sex, and race. We found that the deep learning models were able to identify age and sex but could not reliably predict race. Without accounting for confounding differences between categories, the AI models predicted sex with an AUC of 0.85 (95% CI 0.84-0.86), age with a mean absolute error of 9.12 years (95% CI 9.00-9.25), and race with AUCs of 0.63-0.71. When predicting race, we show that adjusting the proportion of the confounding variable (sex) in the training data significantly affects AUC (from 0.57 to 0.84), whereas when training sex-prediction models, adjusting the confounder (race) does not substantially change AUC (0.81-0.83). This suggests that a large portion of the model's apparent performance in predicting race comes from confounding features detected by the AI. Further work remains to identify the specific imaging features associated with demographic information and to better understand the risks of demographic identification in medical AI as it relates to potential bias and disparities.
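The confounder adjustment described above amounts to resampling the training set so that the proportion of the confounding variable (e.g., sex) within each label class is controlled. A minimal pandas sketch with a made-up dataframe; the column names and values are assumptions, not the study's actual data schema.

```python
import pandas as pd

def resample_confounder(df: pd.DataFrame, label: str, confounder: str,
                        target_frac: float, seed: int = 0) -> pd.DataFrame:
    """Within each label class, downsample rows so the confounder's first
    category makes up `target_frac` of that class."""
    parts = []
    for _, group in df.groupby(label):
        a = group[group[confounder] == group[confounder].unique()[0]]
        b = group.drop(a.index)
        n_total = min(len(a) / target_frac if target_frac > 0 else float("inf"),
                      len(b) / (1 - target_frac) if target_frac < 1 else float("inf"))
        n_a, n_b = int(n_total * target_frac), int(n_total * (1 - target_frac))
        parts.append(pd.concat([a.sample(n_a, random_state=seed),
                                b.sample(n_b, random_state=seed)]))
    return pd.concat(parts).reset_index(drop=True)

# Hypothetical table of studies with demographic labels.
df = pd.DataFrame({"race": ["A"] * 500 + ["B"] * 500,
                   "sex": ["F", "M"] * 500})
balanced = resample_confounder(df, label="race", confounder="sex", target_frac=0.5)
print(balanced.groupby(["race", "sex"]).size())
```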
Fair machine learning (ML) researchers have coalesced around several fairness criteria that provide formal definitions of fairness for ML models. However, these criteria have some serious limitations. We identify four key shortcomings of these formal fairness criteria and aim to help address them by extending performative prediction to include a distributionally robust objective.
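A distributionally robust objective of the kind referred to above can be illustrated by a worst-case-group loss: instead of minimizing average loss, the learner minimizes the maximum per-group loss. The sketch below is a generic group-DRO-style example in plain NumPy, not the paper's exact formulation.

```python
import numpy as np

def group_losses(w, X, y, groups):
    """Per-group mean squared error of a linear predictor."""
    preds = X @ w
    return {g: np.mean((preds[groups == g] - y[groups == g]) ** 2)
            for g in np.unique(groups)}

def worst_group_loss(w, X, y, groups):
    # Distributionally robust objective: the adversary puts all weight on the
    # group (i.e., the distribution over groups) with the highest loss.
    return max(group_losses(w, X, y, groups).values())

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
groups = rng.integers(0, 2, size=200)
y = X @ np.array([1.0, -2.0, 0.5]) + 0.5 * groups + rng.normal(scale=0.1, size=200)

w_avg = np.linalg.lstsq(X, y, rcond=None)[0]          # ordinary average-loss fit
print("avg-loss fit, worst group:", worst_group_loss(w_avg, X, y, groups))

# Crude robust fit: subgradient steps on the worst-group loss.
w = np.zeros(3)
for _ in range(500):
    losses = group_losses(w, X, y, groups)
    g = max(losses, key=losses.get)                    # current worst-off group
    Xg, yg = X[groups == g], y[groups == g]
    w -= 0.05 * (2 / len(yg)) * Xg.T @ (Xg @ w - yg)   # gradient of that group's loss
print("robust fit, worst group:", worst_group_loss(w, X, y, groups))
```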